All Questions
Tagged with hyperparameter deep-learning
19 questions
1 vote
0 answers
44 views
Hyperparameter tuning of an LSTM network on time series data
I am trying to train an LSTM model (four LSTM layers of 500 units each, three dropout layers, and a fully connected output layer for regression) on time series data. To start with, I tried to ...
1 vote
0 answers
19 views
Why do I need a tiny learning rate to overfit the model?
I am trying to train an LSTM model on time series data with 1.6 million records, using a window size of 200. Initially I tried to overfit the model (train data = test data) on a tiny dataset (few ...
1 vote
0 answers
184 views
Why is a neural network not doing better than multivariate linear regressions?
I am building neural networks for multiple targets, all using the same training data. For some of these targets, multivariate linear regressions do a very good job, i.e. a strong linear relation exists ...
1 vote
1 answer
373 views
Different results between hyperparameter optimisation and actual training/val values
When I run a hyperparameter optimisation on a dataset using e.g. Hyperband or random search, I notice that some of the randomly chosen models seem to have rather good R2 scores, MSE etc. I ...
1 vote
0 answers
17 views
Feature Selection - Comparing Performance of different size datasets
If I have training data X, with N features, and I do feature selection, and discover n of <...
0 votes
3 answers
4k views
Why does hyperparameter tuning occur on validation dataset and not at the very beginning?
Despite doing/using it a few times, I'm still slightly confused by the use of a validation set for hyperparameter tuning. As far as I can tell, I choose a model, train it on training data, assess ...
3 votes
1 answer
2k views
Estimating Length of Hyperband Trials in Advance
I would like to use the (Keras/TensorFlow) Hyperband tuning algorithm more than the Keras random search, for instance, when testing hyperparameters. With random search I can set max trials and get a ...
6 votes
1 answer
1k views
Why does BERT classification do worse with longer sequence length?
I've been experimenting with transformer networks like BERT for some simple classification tasks. My tasks are binary assignment, the datasets are relatively balanced, and the corpora are abstracts ...
2 votes
2 answers
4k views
Why does SVM grid search take a longer time?
I have a dataset of 5K records and 60 features for binary classification. Please find my code below for SVM parameter tuning. It's running for a longer time than ...
1 vote
1 answer
1k views
Hyperparameter tuning of a neural network according to loss vs. according to scoring function
During hyperparameter tuning we select a metric to measure the performance of the model. Examples of metrics: F1 score, precision, recall, AUC ... In general, for the training of neural networks, back-...
3 votes
1 answer
497 views
How to tune parameters for Time Series Analysis, when forecasting is only dominated by one feature and error is not getting reduced?
I am trying to predict a time series based on 150 features. When I plot the correlation of these features, I get 20 features with more or less importance, but with every model I use, it is completely ...
6 votes
3 answers
522 views
How to make it possible for a neural network to tune its own hyperparameters?
I am curious about what would happen to hyperparameters if they were set by the neural network itself, or by creating a neural network that encapsulates and influences the hyperparameters of the ...
4 votes
2 answers
243 views
Benefits of using Deep Learning-specific hyperparameter optimization tools vs. sklearn?
There are quite a few libraries for hyperparameter optimization that are specific to Keras or other deep learning libraries, like Hyperas or Talos. My question is, what's the main benefit of using ...
41 votes
6 answers
12k views
How to set the number of neurons and layers in neural networks
I am a beginner to neural networks and have had trouble grasping two concepts: How does one decide the number of middle layers a given neural network should have? 1 vs. 10 or whatever. How does one decide ...
2 votes
1 answer
98 views
How to think about prediction error that is not convex in a hyperparameter, or over the course of training
Take the following case of a hyperparameter and prediction error: Imagine that the hyperparameter is an L2 penalty or a dropout rate -- something that we think should have a single sweet spot -- ...